NSF PAR Search | NSF Public Access Repository

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?

Some links on this page may take you to non-federal websites. Their policies may differ from this site.

OpenRAND: A performance portable, reproducible random number generation library for parallel computations

https://doi.org/10.1016/j.softx.2024.101773

Khan, Shihab Shahriar; Palmer, Bryce; Edelmaier, Christopher; Aktulga, Hasan Metin (September 2024, SoftwareX)

Full Text Available
Amber free energy tools: Interoperable software for free energy simulations using generalized quantum mechanical/molecular mechanical and machine learning potentials

https://doi.org/10.1063/5.0211276

Tao, Yujun; Giese, Timothy J; Ekesan, Şölen; Zeng, Jinzhe; Aradi, Bálint; Hourahine, Ben; Aktulga, Hasan Metin; Götz, Andreas W; Merz, Kenneth M; York, Darrin M (June 2024, The Journal of Chemical Physics)

We report the development and testing of new integrated cyberinfrastructure for performing free energy simulations with generalized hybrid quantum mechanical/molecular mechanical (QM/MM) and machine learning potentials (MLPs) in Amber. The Sander molecular dynamics program has been extended to leverage fast, density-functional tight-binding models implemented in the DFTB+ and xTB packages, and an interface to the DeePMD-kit software enables the use of MLPs. The software is integrated through application program interfaces that circumvent the need to perform “system calls” and enable the incorporation of long-range Ewald electrostatics into the external software’s self-consistent field procedure. The infrastructure provides access to QM/MM models that may serve as the foundation for QM/MM–ΔMLP potentials, which supplement the semiempirical QM/MM model with a MLP correction trained to reproduce ab initio QM/MM energies and forces. Efficient optimization of minimum free energy pathways is enabled through a new surface-accelerated finite-temperature string method implemented in the FE-ToolKit package. Furthermore, we interfaced Sander with the i-PI software by implementing the socket communication protocol used in the i-PI client–server model. The new interface with i-PI allows for the treatment of nuclear quantum effects with semiempirical QM/MM–ΔMLP models. The modular interoperable software is demonstrated on proton transfer reactions in guanine-thymine mispairs in a B-form deoxyribonucleic acid helix. The current work represents a considerable advance in the development of modular software for performing free energy simulations of chemical reactions that are important in a wide range of applications.
more » « less
Full Text Available
Hybrid eigensolvers for nuclear configuration interaction calculations

https://doi.org/10.1016/j.cpc.2023.108888

Alperen, Abdullah; Aktulga, Hasan Metin; Maris, Pieter; Yang, Chao (November 2023, Computer Physics Communications)

Full Text Available
A Portable Sparse Solver Framework for Large Matrices on Heterogeneous Architectures

Rabbi, Fazlay; Daley, Christopher S.; Catalyurek, Umit V.; Aktulga, Hasan Metin (April 2023, 2022 IEEE 29th International Conference on High Performance Computing, Data, and Analytics (HiPC))

Full Text Available
Quantum Mechanics/Molecular Mechanics Simulations on NVIDIA and AMD Graphics Processing Units

https://doi.org/10.1021/acs.jcim.2c01505

Manathunga, Madushanka; Aktulga, Hasan Metin; Götz, Andreas W.; Merz, Kenneth M. (February 2023, Journal of Chemical Information and Modeling)

Full Text Available
A Portable Sparse Solver Framework for Large Matrices on Heterogeneous Architectures

https://doi.org/10.1109/HiPC56025.2022.00030

Rabbi, Fazlay; Daley, Christopher S.; Çatalyürek, Ümit V.; Aktulga, Hasan Metin (December 2022, International Conference on High Performance Computing, Data, & Analytics)

Full Text Available
High Performance Evaluation of Helmholtz Potentials Using the Multi-Level Fast Multipole Algorithm

https://doi.org/10.1109/TPDS.2022.3165649

Lingg, Michael P.; Hughey, Stephen M.; Shanker, Balasubramaniam; Aktulga, Hasan Metin (December 2022, IEEE Transactions on Parallel and Distributed Systems)

Full Text Available
JAX-ReaxFF: A Gradient-Based Framework for Fast Optimization of Reactive Force Fields

https://doi.org/10.1021/acs.jctc.2c00363

Kaymak, Mehmet Cagri; Rahnamoun, Ali; O’Hearn, Kurt A.; van Duin, Adri C.; Merz, Kenneth M.; Aktulga, Hasan Metin (September 2022, Journal of Chemical Theory and Computation)

Full Text Available
Optimizing Data Locality and Termination Criterion for t-SNE

https://doi.org/10.1109/IJCNN52387.2021.9534303

Dikbayir, Doga; Shanker, Balasubramaniam; Aktulga, Hasan Metin (July 2021, 2021 International Joint Conference on Neural Networks (IJCNN))

The t-Distributed Stochastic Neighbor Embedding (t-SNE) is known to be a successful method at visualizing high-dimensional data, making it very popular in the machine-learning and data analysis community, especially recently. However, there are two glaring unaddressed problems: (a) Existing GPU accelerated implementations of t-SNE do not account for the poor data locality present in the computation. This results in sparse matrix computations being a bottleneck during execution, especially for large data sets. (b) Another problem is the lack of an effective stopping criterion in the literature. In this paper, we report an improved GPU implementation that uses sparse matrix re-ordering to improve t-SNE's memory access pattern and a novel termination criterion that is better suited for visualization purposes. The proposed methods result in up to 4.63 x end-to-end speedup and provide a practical stopping metric, potentially preventing the algorithm from terminating prematurely or running for an excessive amount of iterations. These developments enable high-quality visualizations and accurate analyses of complex large data sets containing up to 10 million data points and requiring thousands of iterations for convergence.
more » « less
Full Text Available
An Evaluation of Task-Parallel Frameworks for Sparse Solvers on Multicore and Manycore CPU Architectures

https://doi.org/10.1145/3472456.3472476

Alperen, Abdullah; Afibuzzaman, Md; Rabbi, Fazlay; Ozkaya, M. Yusuf; Catalyurek, Umit; Aktulga, Hasan Metin (August 2021, ICPP 2021: 50th International Conference on Parallel Processing)

Recently, several task-parallel programming models have emerged to address the high synchronization and load imbalance issues as well as data movement overheads in modern shared memory architectures. OpenMP, the most commonly used shared memory parallel programming model, has added task execution support with dataflow dependencies. HPX and Regent are two more recent runtime systems that also support the dataflow execution model and extend it to distributed memory environments. We focus on parallelization of sparse matrix computations on shared memory architectures. We evaluate the OpenMP, HPX and Regent runtime systems in terms of performance and ease of implementation, and compare them against the traditional BSP model for two popular eigensolvers, Lanczos and LOBPCG. We give a general outline in regards to achieving parallelism using these runtime systems, and present a heuristic for tuning their performance to balance tasking overheads with the degree of parallelism that can be exposed. We then demonstrate their merits on two architectures, Intel Broadwell (a multicore processor) and AMD EPYC (a modern manycore processor). We observe that these frameworks achieve up to 13.7 × fewer cache misses over an efficient BSP implementation across L1, L2 and L3 cache layers. They also obtain up to 9.9 × improvement in execution time over the same BSP implementation.
more » « less
Full Text Available

« Prev Next »

Search for: All records